The picasso Package for Nonconvex Regularized M-estimation in High Dimensions in R
Authors
Abstract
We describe an R package named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (Sparse Linear Regression, Sparse Logistic Regression and Sparse Column Inverse Operator), combined with distinct active set identification schemes (truncated cyclic, greedy, randomized and proximal gradient selection). In addition, the package offers a choice between convex (ℓ1 norm) and nonconvex (MCP and SCAD) regularizations. These methods provide a broad range of sparsity-inducing regularizations for the most commonly used regression approaches, and the various active set identification schemes allow a trade-off between statistical consistency and computational efficiency. Moreover, picasso provably converges linearly to a unique sparse local optimum with optimal statistical properties, a guarantee that competing packages (e.g., ncvreg) do not have. The package is coded in C and scales efficiently to large problems, with memory usage optimized via sparse matrix output.
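To make the idea of pathwise coordinate optimization with convex and nonconvex penalties concrete, here is a minimal sketch in plain Python. It is not the picasso implementation (which is written in C and adds active set strategies); it only illustrates cyclic coordinate descent for sparse linear regression, warm-started along a decreasing sequence of regularization parameters. The function names, the default `gamma` values, and the assumption that every column of `X` is scaled so that (1/n) * sum of squares equals 1 are choices made for this illustration, and no intercept is fitted.

```python
def soft_threshold(z, lam):
    """Soft-thresholding: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def threshold(z, lam, method="l1", gamma=3.0):
    """Univariate update for the l1, MCP or SCAD penalty.

    Assumes a standardized column; MCP needs gamma > 1, SCAD needs gamma > 2.
    """
    if method == "l1":
        return soft_threshold(z, lam)
    if method == "mcp":
        if abs(z) <= gamma * lam:
            return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
        return z  # large coefficients are left unshrunk
    if method == "scad":
        if abs(z) <= 2.0 * lam:
            return soft_threshold(z, lam)
        if abs(z) <= gamma * lam:
            shrunk = soft_threshold(z, gamma * lam / (gamma - 1.0))
            return shrunk / (1.0 - 1.0 / (gamma - 1.0))
        return z
    raise ValueError("method must be 'l1', 'mcp' or 'scad'")


def coordinate_descent_path(X, y, lambdas, method="mcp", gamma=3.0,
                            max_iter=1000, tol=1e-7):
    """Return one coefficient vector per value in `lambdas`.

    `lambdas` should be decreasing; each solution warm-starts the next.
    """
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    resid = list(y)  # residual y - X @ beta, kept up to date
    path = []
    for lam in lambdas:
        for _ in range(max_iter):
            max_change = 0.0
            for j in range(d):
                # Partial-residual correlation for coordinate j.
                z = beta[j] + sum(X[i][j] * resid[i] for i in range(n)) / n
                new = threshold(z, lam, method, gamma)
                delta = new - beta[j]
                if delta != 0.0:
                    for i in range(n):
                        resid[i] -= delta * X[i][j]
                    beta[j] = new
                    max_change = max(max_change, abs(delta))
            if max_change < tol:
                break
        path.append(list(beta))
    return path
```

On a small orthogonal design, the sketch shows the practical difference between the penalties: the ℓ1 penalty shrinks a large coefficient by the regularization parameter, while MCP leaves it unshrunk once it exceeds `gamma * lam`.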
Related resources
The picasso Package for High Dimensional Regularized Sparse Learning in R
We introduce an R package named picasso, which implements a unified framework of pathwise coordinate optimization for a variety of sparse learning problems (Sparse Linear Regression, Sparse Logistic Regression and Sparse Poisson Regression), combined with efficient active set selection strategies. Besides, the package allows users to choose different sparsity-inducing regularizers, including the...
Regularized Autoregressive Multiple Frequency Estimation
The paper addresses the problem of tracking multiple frequencies using a Regularized Autoregressive (RAR) approximation. The RAR procedure reduces approximation bias compared with other AR-based frequency detection methods, while still providing competitive variance of the sample estimates. We show that the RAR estimates of multiple periodicities are consistent in probabilit...
Estimation of exposure to fine particulate air pollution using GIS-based modeling approach in an urban area in Tehran
High concentrations of particulate matter, a major public health concern, are a worldwide problem in many industrialized areas. Because assessing air pollution and analyzing its spatial dispersion are complicated by many factors, this paper uses a GIS-based modeling approach to map PM2.5 dispersion over Tehran, du...
Retrieving Three Dimensional Displacements of InSAR Through Regularized Least Squares Variance Component Estimation
Measuring 3D displacement fields provides essential information about the interaction of the Earth's crust and the rheology of the mantle. Interferometric synthetic aperture radar (InSAR) is well suited to revealing displacements of the Earth's crust. However, it measures the real 3D displacements only along the line-of-sight (LOS) direction. The 3D displacement vectors can be retrieved ...
On Quadratic Convergence of DC Proximal Newton Algorithm for Nonconvex Sparse Learning in High Dimensions
We propose a DC proximal Newton algorithm for solving nonconvex regularized sparse learning problems in high dimensions. Our proposed algorithm integrates the proximal Newton algorithm with multi-stage convex relaxation based on difference of convex (DC) programming, and enjoys both strong computational and statistical guarantees. Specifically, by leveraging a sophisticated characterization of ...